Skip to content

Feat/device attach#21

Merged
CMGS merged 22 commits intomasterfrom
feat/device-attach
Apr 28, 2026
Merged

Feat/device attach#21
CMGS merged 22 commits intomasterfrom
feat/device-attach

Conversation

@CMGS
Copy link
Copy Markdown
Contributor

@CMGS CMGS commented Apr 28, 2026

No description provided.

CMGS added 22 commits April 28, 2026 13:03
Define the Attacher/Lister interfaces and Spec validation for two
runtime hot-attach surfaces — vhost-user-fs (typically via virtiofsd)
and VFIO PCI passthrough. Backends opt in via type assertion; CLI
gives clear errors on unsupported backends.

State semantics: attach is runtime-only, never persisted, lost on VM
stop. Cocoon does not own the backend (virtiofsd / vfio-pci binding).
Vhost-user-fs hot-plug requires CH memory shared=on; this is fixed at
VM creation. Add VMConfig.SharedMemory, plumb through CH chMemory and
the --memory CLI arg, propagate via clone/restore. Reject --fc with
--shared-memory since Firecracker has no virtio-fs.

patchCHConfig already preserves memory.shared via raw-JSON merging,
so clone/restore round-trip without further work.
Introduce chFs / chDevice / chPciDeviceInfo types matching CH OpenAPI,
extend chVMInfoConfig to surface fs and devices arrays, and add
addFsVM / addDeviceVM / getVMInfo / decodePciDeviceInfo helpers.

vmAPICall extracted from vmAPI so callers needing the response body
share the existing retry logic. Vhost-user-fs and VFIO add use the
returned PciDeviceInfo to record the device id.
FsAttach: validate spec, assert VM running and memory.shared=on,
detect tag/id collision via vm.info, call vm.add-fs with id derived
from tag (cocoon-fs-<tag>) so concurrent attaches collide on CH's
id-uniqueness check.

DeviceAttach: normalize PCI input to canonical sysfs path, assert it
exists and is a directory, detect path/id collision via vm.info, call
vm.add-device. Detach common: vm.remove-device.

FsList / DeviceList read vm.info and translate into the
runtime-only inspect DTOs from the extend packages.
cocoon vm fs    attach/detach VM --socket / --tag / [num/queue size]
cocoon vm device attach/detach VM --pci / [--id]

Handlers type-assert to extend/fs.Attacher and extend/vfio.Attacher;
unsupported backends (Firecracker today) report a clear error
naming the backend. ErrNotRunning is unwrapped into a friendlier
message so users see why the call failed.
Wrap types.VM in an inspect-only DTO that adds attached_devices when
the VM is running and the backend implements fs.Lister / vfio.Lister.
Listing failures are logged and dropped: inspect should not fail just
because vm.info is briefly unreachable.

Cocoon never persists attached devices, so types.VM stays unchanged.
README: new 'Runtime Device Attach' section, --shared-memory flag,
vm fs / vm device subcommands in the CLI tree.

KNOWN_ISSUES: shared-memory creation prereq, CH rejection of
snapshot when fs/vfio attached, runtime-only lifecycle (lost on
stop/clone/restore).
- runningVMClient: switch to utils.VerifyProcess for PID-reuse safety
  (cocoon convention; bare IsProcessAlive can talk to a reused PID)
- vfio.Spec: collapse Validate+NormalizePath into NormalizedPath, removing
  the redundant regex match in DeviceAttach
- bdfFromSysfsPath: return empty when CH reports a non-PCI host path
  rather than echoing the raw path back as a BDF
- chDevice: drop unused IOMMU field; v1 does not expose --iommu
- FsAttach: clearer shared-memory error referring to creation flag
- queryConsolePTY: use shared getVMInfo helper
- commands.go: extract attachGroup to fold buildFsCommand and
  buildDeviceCommand cobra builders into a single shape
runningVMClient was reading rec.PID from the VM record, but cocoon never
stores the live PID there — it lives in the runDir/ch.pid file. The
existing canonical gate (Backend.WithRunningVM) reads ReadPIDFile +
VerifyProcessCmdline; mirror that pattern so attach/detach see a real
PID instead of always 0 (which caused 'pid 0 not cloud-hypervisor' on
healthy VMs).

DeviceAttach: reorder os.Stat after runningVMClient so a stopped VM
reports the VM-state error instead of a misleading host-path error.
Fs and VFIO hot-plug skeletons collapse into three shared functions
with per-call closures supplying the spec validation, conflict scan,
device-id extraction, and CH endpoint. The four backend methods
(FsAttach/FsDetach/FsList + DeviceAttach/DeviceDetach/DeviceList)
shrink from independent ~30-line bodies to thin specs.

addFsVM / addDeviceVM removed: attachWith now marshals + decodes
PciDeviceInfo inline so the only call sites that needed those
helpers no longer exist.

Inline TODO on FsList / DeviceList notes the dual vm.info round-trip
on inspect; consolidating into one combined Lister is left for
follow-up since it requires a cross-package interface.
…pers

attachWith / detachWith / listWith all opened with the same pair —
runningVMClient + getVMInfo — so factor it into one inspectRunning
call. The change is mechanical: each helper goes from two-step setup
to one, and the live-VM gate stays on a single code path.
vfio.NormalizePath now rejects absolute paths that fall outside
/sys/bus/pci/devices/ and absolute paths whose suffix is not a valid
full BDF. Previously cocoon would happily forward /dev/null or any
non-PCI directory to vm.add-device. Tests flip the /dev/null case from
PASS to expected rejection and add coverage for sysfs-prefixed garbage
and short-form BDFs under the prefix.

cmd/vm/debug now threads VMConfig.SharedMemory into printCommonCHArgs
so 'cocoon vm debug --shared-memory' actually prints --memory ...,
shared=on. The check that rejects --fc + --shared-memory in create/run
is mirrored on the debug path so the contract is the same across the
three entry points.

Backend-layer error messages also drop the '--flag' prefix in fs and
vfio Spec validation; the CLI flag terminology belongs in the cmd/vm
layer, not the extend packages.
…cycle

README: spell out which inputs vm device attach --pci accepts now that
NormalizePath rejects arbitrary absolute paths. The previous 'BDF or
full sysfs path' phrasing was loose and could read as 'any path'.

KNOWN_ISSUES: add a note that upstream virtiofsd exits after one
client disconnect, so scripted attach/detach cycles must respawn the
daemon between calls. Caught during E2E.
vmAPIOnce: vm.add-fs and vm.add-device are non-idempotent; a retry
after CH has already accepted the device but the response was lost
would surface as a 'duplicate id' rejection. Skip DoWithRetry on those
two endpoints. Existing addDiskVM/addNetVM stay on the retry path
(deterministic startup, no user-visible CLI surface today).

chFs: align num_queues / queue_size with chDisk and chNet by adding
omitempty so a zero value (e.g. from a programmatic caller that
skipped Spec.Validate) is dropped rather than rejected by CH.

debug: collapse printCommonCHArgs's 6 positional args into a chDebugArgs
struct so future flags don't keep widening the signature.

lifecycle: mirror the TODO(inspect) note from cloudhypervisor/extend.go
so the dual vm.info round-trip is visible at the call site too.

extend.go: flatten the 'else if' chain after a return path in
DeviceAttach (style).
… error

cmd/vm/commands.go: drop attachGroup struct + build() in favor of two
inline buildFsCommand/buildDeviceCommand bodies. The struct's only job
was to group config strings and flag callbacks for two callers, but
their flag setup never actually overlapped — the indirection added
~30 lines without dedup'ing anything. Plain cobra construction reads
top-to-bottom now.

extend/fs: rename Spec.Validate to Spec.Normalize. The function applies
defaults to the receiver, so the old name lied about purity. Mirrors
vfio.Spec.NormalizedPath in spirit.

cloudhypervisor/extend.go runningVMClient: surface a missing pidfile
as 'read pidfile: <err>' instead of silently degrading to 'pid 0'.
Three files I introduced/touched had standalone funcs interleaved with
struct methods. SKILL.md mandates 'methods grouped, then standalone
functions at the bottom':

- cloudhypervisor/extend.go: listWith was placed between detachWith and
  runningVMClient methods. Moved to bottom alongside bdfFromSysfsPath.
- cmd/vm/lifecycle.go: collectAttachedDevices was inserted between
  Inspect and Console handler methods. Moved to bottom of the file.
- cmd/vm/debug.go: chDebugArgs type was declared between printCHDebug
  and printCommonCHArgs. Moved to top with the other type
  declarations.

Self-review missed these during the /code walkthrough; flagged by the
user.
… DoAPIOnce

splitSuccessCodes returned a (primary, alt) tuple that both callers
fanned right back out into doAPIAcceptingAlt — three helpers for what
is one operation. Inline into a single doAPI(...successCodes) that
hides the default + alt-tolerance, and have DoAPIWithRetry / DoAPIOnce
call it directly.

snapshot/restore on both backends now uses DoAPIOnce instead of raw
DoAPI. Behavior is unchanged (no alt codes, no retry either way), but
the named helper makes the 'this is non-idempotent, do not retry'
intent uniform with vm.add-fs / vm.add-device which already use
vmAPIOnce. Comments updated to explain the *why* (multi-GiB memory
transfer, partial state.json risk) instead of citing the old
fcAPI/vmAPI retry layer.
vm.remove-device is non-idempotent: if CH detaches the device but the
response is lost, the DoAPIWithRetry retry hits 'id not found' and
the user sees a failure for an operation that actually succeeded.

This is the same failure mode the previous review pass fixed for
vm.add-fs / vm.add-device. Route removeDeviceVM through vmAPIOnce
for consistent semantics across attach and detach hot paths.

Callers are clone.go's hotSwapNets (admin path), extend.go's
FsDetach and DeviceDetach (user-initiated CLI). Last two are the
ones that surface the bad UX directly.
…*APIOnce

Same correctness reasoning as removeDeviceVM and the earlier vm.add-fs /
vm.add-device pass: a retry after a state-mutating endpoint already
succeeded but the response was lost surfaces as a wrong-state error
("already paused", "already shut down", "duplicate id") and masks
the original success.

Migrated to vmAPIOnce / fcAPIOnce:

  CH  shutdownVM, pauseVM, resumeVM (used by stop and snapshot/restore)
  CH  addDiskVM, addNetVM            (used by clone after vm.restore)
  FC  pauseVM, resumeVM              (PATCH /vm)
  FC  instanceStart                  (PUT /actions InstanceStart)

Kept on retry-enabled fcAPI / vmAPI (pre-boot config or semi-idempotent):

  FC  putMachineConfig, putBootSource, putDrive, putNetworkInterface,
      putBalloon, putEntropy   — pre-boot PUT, retry overwrites cleanly
  FC  sendCtrlAltDel           — repeating a key event is harmless
  CH  powerButton              — ACPI button press, no-op if already off

Side effects:

- fcAPI loses its method param (always PUT now) and successCodes
  (always 204). Trim the signature to keep unparam happy. If a future
  caller needs alt codes, route through DoAPIWithRetry directly.
- New fcAPIOnce keeps the method param because pause/resume use PATCH
  while instanceStart uses PUT.
Cut WHAT-restating godoc and inline comments per SKILL.md (default to no
comments; only keep WHY when non-obvious). Net -146 LOC across 15 files;
no behavior changes.
Consolidate types at top, group all Backend public methods, then private,
then standalone funcs. Eliminates the 15 layout violations introduced when
LaunchVMProcess / RestoreSequence / DirectRestoreSequence were appended at
end of file in the recent template refactor.

No behavior change.
Aggregate findings from 5 parallel audit agents (ordering/types/naming,
errors+modernization, reuse, quality, efficiency) across 156 non-test files.
All actionable items applied; net -3 LOC.

Reuse / templates:
- hypervisor.PreflightRestore template: load+validate sidecar, integrity
  callback, ValidateRoleSequence. CH/FC preflightRestore become 3-line
  delegations.
- cmd/core.resolveOwner generic: ResolveImageOwner and resolveVMOwner
  collapse onto one helper parameterized by found-callback + sentinels.
- firecracker.putJSON generic: 5 putXxx helpers collapse to thin wrappers.

Efficiency:
- snapshot/localfile.Delete batches the per-id store.Update into a single
  transaction (was N+1 flock cycles on M-ref delete).
- cmd/vm.recoverNetwork uses one List + map lookup instead of M Inspect
  calls under DB lock.
- network/cni: drop redundant LinkByName re-fetch when overrideMAC is set
  (already known value).

Style / dead code:
- Remove dead vmAPI / vmAPICall (unused after the *APIOnce migration).
- Remove duplicate TODO(inspect) in cloudhypervisor/extend.go.
- Drop firecracker PR-number reference from clone.go (per comment hygiene).
- Restore exec-justification comment in cloudhypervisor/start.go.
- utils/tar: errors.Is(err, io.EOF) instead of == comparison.
- strconv.Itoa / FormatInt instead of fmt.Sprintf("%d", ...).
- cloudhypervisor/snapshot: cowName via filepath.Base(cowPath) instead of
  branched literal.
- cmd/vm: vmID[:8] → network.VMIDPrefix.

Layout:
- network/bridge/bridge_other.go: var before type.
- images/cloudimg/inspect.go: var before type.

make lint dual-platform 0 issues; make test all pass.
@CMGS CMGS merged commit 10113f1 into master Apr 28, 2026
4 checks passed
@CMGS CMGS deleted the feat/device-attach branch April 28, 2026 08:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant